Traditional methods of reporting changes in student responses have focused on class-wide averages. 
Such models hide information about the switches in responses by individual students over the course 
of a semester. We extend unpublished work by Steven Kanim on "escalator diagrams" which show 
changes in student responses from correct to incorrect (and vice versa) while representing pre- and 
post-instruction results on questions. Our extension consists of "consistency plots" in which we 
represent three forms of data: method of solution and correctness of solution both before and after 
instruction. Our data are from an intermediate mechanics class, and come from (nearly) identical 
midterm and final examination questions. 
I. INTRODUCTION 

At the University of Maine, we are investigating what
conceptual, procedural, and epistemological tools students
use when solving first-order, separable differential
equations in the context of intermediate mechanics [1, 2].
As part of this study, we are interested in how student
responses to identical questions change over time.

In physics education research (PER), we often compare
student results on identical questions. For example,
surveys like the Force Concept Inventory (FCI) [3] and the
Force and Motion Conceptual Evaluation (FMCE) [4] are
used both before and after instruction.
Questions from an ungraded quiz might come back in
only slightly altered form on later examinations. Many
instructors use questions twice: one question from the
midterms might show up on the final examination. Furthermore,
we often claim that questions asked before
instruction as "baseline" data for studying the level
of student understanding are nearly identical to later,
post-instruction examination questions (see, for example,
many of the citations in ref. 5). We often compare
students' responses on each question to then make claims
about the effectiveness of a teaching intervention in the
process.

Changes in student performance (as from pre- to post-instruction
tests) have typically been reported on a class
average basis (see, again, many of the citations in ref. 5).
This method of reporting has been very useful in
roughly gauging the overall knowledge state of the class
when assessing the effectiveness of different kinds of
curricula [5-7]. However, comparing class averages can
only shed light on individual student response patterns
when the individual responses change like the group response.
When class response patterns remain static,
we know next to nothing about individual student responses.
To describe the space between these two extremes,
Bao [8] discusses the differences between reporting
the class average normalized gain and the average of
individual students' normalized gains and suggests methods
by which the differences in these two measures can
yield information about the students under discussion.
Further issues arise, for example static group data in
which individuals give varying and internally inconsistent
responses. Such differences can have meaning [9].
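
The distinction between the two measures can be made concrete with a short sketch. It assumes the common definition of normalized gain, (post - pre)/(max - pre), and uses four hypothetical students; neither the numbers nor the function name come from this study or from ref. 8.

```python
def normalized_gain(pre, post, max_score=100.0):
    """Normalized gain under the common definition (post - pre) / (max - pre)."""
    return (post - pre) / (max_score - pre)

# Hypothetical matched pre/post percentages for four students (illustration only).
pre = [20.0, 40.0, 60.0, 80.0]
post = [60.0, 40.0, 90.0, 80.0]

# Measure 1: normalized gain computed from the class averages.
class_pre = sum(pre) / len(pre)
class_post = sum(post) / len(post)
gain_of_averages = normalized_gain(class_pre, class_post)

# Measure 2: average of each student's individual normalized gain.
individual_gains = [normalized_gain(a, b) for a, b in zip(pre, post)]
average_of_gains = sum(individual_gains) / len(individual_gains)

print(gain_of_averages, average_of_gains)
```

In this invented class, two students do not change at all while two improve substantially, and the two measures disagree; the class-level number alone does not reveal which situation produced it.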

Others have attempted to address this inherent difficulty
in reporting changes in student response. The goal,
as introduced by Kanim (unpublished), was to indicate
class-wide shifts from more correct to more incorrect responses,
without assuming that no students moved from
incorrect to correct. Kanim developed the visually compelling
escalator diagram (see figure 1) that illustrates
in a compact icon both how many students responded
correctly or incorrectly to a particular question at two
different times and how many students changed their response
from correct to incorrect or vice versa [15]. The
plot is read from left to right. The sizes of the blue and
red regions on the left edge indicate how many students
got the question correct or incorrect during the first asking;
the sizes of the corresponding regions on the far right
represent correct and incorrect answers during the second
asking. While most students maintain their correctness
state, some students who were initially correct get on the
red "down escalator" and give an incorrect answer later,
and some who were initially wrong go up the blue escalator.
The width of the diagonal escalator lines indicates
how many students changed their answer.

FIG. 1: Caption for Kanim escalator graph.

Figure 1 shows three qualitative escalator diagrams.
In the first, "Force/Time," more students change to correct
than incorrect, leaving a net positive change in correctness.
In the second and third, the total number of
students answering correctly remains the same. This
requires that an equal number of students change
their answers from correct to incorrect and incorrect to
correct between the two administrations of the test, as
illustrated by the equal-width diagonal lines connecting
the blue (correct) and red (incorrect) regions. However,
in the "Trajectory" diagram, the number of students
changing their answers is relatively small, whereas
in the "Work" diagram, a relatively large number of students
change their answers. This information would be
obscured in a traditional tabular representation of the
data.
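
The four quantities an escalator diagram encodes can be tallied directly from matched data. The following is a minimal sketch with invented data; the function name and the two hypothetical classes are assumptions made for illustration and are not Kanim's procedure.

```python
from collections import Counter

def escalator_counts(pre_correct, post_correct):
    """Tally matched (first asking, second asking) correctness for each student."""
    counts = Counter(zip(pre_correct, post_correct))
    return {
        "stay_correct": counts[(True, True)],
        "down": counts[(True, False)],    # correct -> incorrect
        "up": counts[(False, True)],      # incorrect -> correct
        "stay_incorrect": counts[(False, False)],
    }

# Two hypothetical classes of six students with identical totals (3 correct each time).
pre = [True, True, True, False, False, False]
post_low_churn = [True, True, False, True, False, False]
post_high_churn = [False, False, False, True, True, True]

low = escalator_counts(pre, post_low_churn)
high = escalator_counts(pre, post_high_churn)
print(low)
print(high)
```

Both invented classes would produce the same row in a table of totals, yet one student rides each escalator in the first class and three in the second.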

In an escalator diagram, the "red" answer can be either
the most common incorrect response (leaving some
"other" as white on the diagram) or it can be all incorrect
answers lumped together (and the white area remains to
help visualize the situation better). Van Deventer [10]
extended the escalator diagram representation to include
many different responses to a question. Figure 2 shows
an example from ref. 10 on student responses to 2-d
vector addition questions in a physics context that came
after nearly identical questions in a non-physics "math"
context. In this paper, we take Kanim's philosophy and
extend it even further to include additional information.

Many kinds of problems in physics have more than 
one correct method of solution. For example, you can 
use either energy reasoning or force reasoning to correctly
solve certain problems in mechanics. As researchers, we 
often wish to connect student solution methods with an 
analysis of the correctness of a solution: Are the students 
using method A doing better than those using method B? 

The topic discussed in this paper is air resistance, of 
interest because it contains the relatively simple analysis 
of a first-order differential equation. We asked identi- 
cal air resistance problems on a midterm and final and 
described student responses in two ways. In one de- 
scription, we coded the work for the technique the stu- 
dent used. In another description, we coded responses 
based on the type of mistakes students made. Comparing
these two complementary descriptions of individual
student method, along with how these methods changed
in time, required a more visual solution than the typical
table and was too complex for efficient use of
escalator diagrams. This led to the development of consistency
plots, a kind of 2-d escalator diagram. These
are fully described later in the paper.

In the following section, we discuss the exam question
analyzed and our coding methodology, followed by
a presentation of our results in a typical table format.
In Section III we describe consistency plots in general
and present a plot of our data. In particular, in section
III A we discuss specific student response patterns which
are obscured by a tabular presentation of the data. We
discuss some limitations of the consistency plot representation
in section III B.



II. STUDENT RESPONSES TO A PHYSICS 
QUESTION 

One of the first problem types in mechanics that re- 
quires the solution of a first-order differential equation is 
the air-resistance problem for objects in a gravitational 
field. We present three years of data from an examination 
question that appeared both on a midterm early in the 
semester (immediately following instruction on air resis- 
tance) and later on the final examination. In 2006, air 
resistance appeared in the first few weeks of the course, 
following a brief review of introductory mechanics. Since
about half of our students take a differential equations
course in the math department concurrently (the other
half have taken it previously), we wanted to allow these
students to encounter the technique of separation of variables
in a mathematical context before doing so in a
physics context. In 2007 and 2008, air resistance was
therefore covered in the middle of the semester.



A. The exam question and solution 

In 2006, the question (see figure 3) appeared in a
midterm exam given in Week 4 of the course. In 2007
and 2008, it appeared as a group quiz in the middle of the
semester. Students in 2007 and 2008 worked on the problem
in small groups during class time, and then solved the
problem again on their own in an individual, take-home
component of the quiz. We present results from these
individual responses rather than the videotaped group
quizzes. We look in particular at students' solutions to
part c of the problem. The videotaped group quizzes are
discussed in ref. [11]; that analysis deals with
coordinate systems, which are not discussed
in detail in this paper.

A correct solution requires several important steps.
First, students must apply the correct coordinate system,
with down as the positive direction and the origin at
the launch height. Second, a coordinate transformation
from $a = dv/dt$ to a differential based on displacement,
$a = v\,dv/dx = \frac{1}{2}\,d(v^2)/dx$, is required. In 2006, this
transformation was part of the previous problem on the
exam, providing a hint for students who might get stuck.
FIG. 2: Expanded Escalator Plot. By including all categories of partially incorrect responses (most egregious at the bottom), 
we gain more understanding of student responses on nearly identical questions (see ref. 10 for more details). 



You are at the top of the building with a beach ball. It is of a size that only the quadratic velocity air 
resistance term exists, 
$F = -cv^2$. 

You throw the ball vertically downward. Because it's convenient, you choose down to be positive (so 
the ball is traveling in the positive direction). You observe that the ball slows down as it travels 
downward, eventually seeming to move at a constant speed until it hits the ground. 

a. Write the equation of motion of the ball before it hits the ground. 

b. What is the ball's speed when you observe it moving at a constant speed? 

c. Find the equation for the velocity of the ball as a function of height. 



FIG. 3: Problem statement. Only the vertical motion part is shown. 



Once the transformation has been applied and variables
separated, the problem is simplified with a u-substitution
that allows the u-integral to be solved with relative ease.
Finally, students had to either choose the correct limits
of integration (limits method) or add an integration
constant and find its value (+C method). Both methods
are correct, of course. An example solution of only the
mathematical steps involved in using the limits method
is shown in figure 4.
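
The steps just described can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not a transcription of the authors' figure: the ball's mass $m$, the gravitational acceleration $g$, the distance fallen $x$ (measured downward from the launch point), the initial speed $v_0$, and the terminal speed $v_t$ are symbols introduced here.

```latex
\begin{align}
  m v \frac{dv}{dx} &= mg - c v^2
    && \text{equation of motion, down positive} \\
  \int_{v_0}^{v} \frac{v'\,dv'}{mg - c v'^2} &= \int_{0}^{x} \frac{dx'}{m}
    && \text{separate variables, apply limits} \\
  -\frac{1}{2c} \ln\left| \frac{mg - c v^2}{mg - c v_0^2} \right| &= \frac{x}{m}
    && u = mg - c v'^2,\ du = -2 c v'\,dv' \\
  v(x) &= \sqrt{v_t^2 + \left(v_0^2 - v_t^2\right) e^{-2 c x / m}},
    && v_t^2 = \frac{mg}{c}
\end{align}
```

As a check on the sketch, $v(0) = v_0$ and $v \to v_t$ as $x$ grows. The argument of the logarithm is positive because $mg - cv^2$ and $mg - cv_0^2$ have the same sign throughout the motion, which is the physical reasoning about the domain of the logarithm that the coding scheme below refers to.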

FIG. 4: Problem solution using limits method. Explanations
have been omitted for brevity.

In this problem, the initial velocity was not given explicitly
and had to be defined by the students as $v_0$ or
something similar. Also, the question asked for students
to give the velocity as a function of height (as measured
from the top, not a typical definition of height!) even
though they had just found the terminal velocity. Although
this might have impaired some students' reasoning,
evidence from video recordings of the group quizzes
(not discussed in this paper) suggests that this atypical
definition was not generally problematic for the students,
who seemed to use height and distance interchangeably.

The same question was given unchanged on the final
examination in 2006 and 2008. In 2007, the question was
changed slightly. Students were asked to find the velocity
as a function of time rather than position, and the
air resistance force was $v$- and not $v^2$-dependent. More
consequentially, students were to consider up as the positive
direction but to consider both upward and downward
motion. This slight perturbation led to a large effect in
the data, creating the necessity of sometimes excluding or
recording separately the results from this year. We will
indicate each time this is the case. The dependence of
student reasoning on simple changes to coordinate systems
is an interesting point, and consistent with other
results from our studies on student use of coordinates
and coordinate systems, but outside of the scope of this
paper [11, 12, 13, 14]. Additionally, because we are discussing
changes in individual student responses, we will
consider only those students (in all years) for whom we
have matched data. Of the 39 students who took both
exams, six students had a completely correct solution on
the midterm and seven were completely correct on the final.
We note that solutions to differential equations were
not a large part of the course content in the last third of
the semester. We present a more detailed analysis of the
results in the following section.



B. Tabular presentation of student answers 

We have gathered data from three years of instruc- 
tion. Solution method categories arose from the data, 
being those methods most commonly used. Most students
used separation of variables to solve the DE. These
students were placed into one of three groups: the
limits method common to physics classes, the +C method
more common to math contexts, and a "limits and
+C" method in which students applied both methods to
the problem. Those students who did not use separation
of variables (including two students who applied the more 
complex technique of variation of parameters with an in- 
tegrating factor) as well as students who used separation 
of variables but did not use limits or an integration con- 
stant (essentially, only finding the anti-derivatives) were 
categorized as "other" solutions. As might be expected, 
most students who did not use separation of variables 
were unable to progress far into the problem. 

We also grouped solutions according to correctness.
"Correct" solutions were carried out mathematically correctly
and related the mathematics to the physical meaning
by successfully using the initial conditions in the final
statement of the solution. Problems with a "math
error" could include simple algebraic mistakes or might
progress to difficulties using a u-substitution. Students
who failed to consider the domain of the natural log function
were also placed in this group, even though the reasoning
required to determine the sign of the argument of
the ln function was physical [2]. Those students with a
"boundary condition error" might use unphysical limits,
set $v_0 = 0$ later in the problem, or leave the integration
constant undefined, simply as +C. Finally, students in
the "other" category either did not come to a final answer
or made a serious physical error while setting up the
problem (such as writing terms in the equation in a way
that contradicted the given coordinate system [11]).

Our original goal in this analysis was to observe how 
students used boundary conditions to give physical mean- 
ing to mathematical statements. Therefore, we defined 
the "boundary condition error" category very narrowly 
and defined the "math error" and "other" categories 
quite broadly. For our specific purposes, math errors 
of any type are less serious than boundary condition er- 
rors because the latter depend on the students' physi- 
cal reasoning abilities while the former may simply be 
evidence of carelessness. Different research
goals (say, on the use of coordinate systems and their
role in translating physical systems into mathematical
statements [11, 12, 13, 14]) would have led to different
categories.

Major results of these groupings are shown in tables I
and II.

TABLE I: Student solution techniques. Comparing student solution methods (39 matched students) on a question given on a 
midterm and final.

TABLE II: Student correctness. Comparing student correctness (39 matched students) on a question given on a midterm and 
final. 



C. Apparent conclusions: A short interlude 

Looking at Table I, we see that when all years are
taken into consideration (as would typically be done for
a course with such low yearly populations and the same
instructional techniques), the methods students use remain
fairly static, with only a change of one or two students
per category. Though we emphasize using the limits
method in the course, many students seem to stick to
the more familiar +C method that they originally learned
in physics.

Table II tells a slightly different story, but one that is
unfortunately familiar to many instructors: few students
moved into the correct category from the midterm to the
final exam, but, in contrast, many students' solutions
change so much that they must be categorized as "other."
While the number of math errors decreases, the number
of boundary condition errors remains fairly constant.

An observation might then be: 

Instruction did not significantly change the 
method students use to solve air-resistance 
problems. While on both midterm and final 
about 15% of students are correct, more than 
twice that many make boundary condition er- 
rors, indicative of a disconnect between the 
mathematical method of separation of vari- 
ables and the physical reality it represents. 
Regrettably, students with an unacceptably 
flawed (categorized as "other") solution in- 
crease from 10% to 45%! 

From these observations, we might pursue several different
conclusions. Perhaps the group work preceding
the written tests in 2007 and 2008 affected midterm
performance and lowered the "other" responses on the
midterm. Perhaps on the final in 2007 and 2008, students
simply had too much to study for and were more
careless in setting up problems. Other conclusions are
available, as well.



Interesting as these hypothetical observations and con- 
clusions might be, more fundamental concerns should be 
addressed first. The tallies shown in Tables I and II
are incomplete because they cannot tell how the solutions
of individual students change over the course of the
semester. Are those the same five stalwart students get- 
ting the right answer on the midterm and final in 2008, 
or is the old guard getting replaced with a fresh crop? In 
2006, fewer students used the "limits and +C" method 
on the final, but what happened to those students who 
abandoned ship? 



A multitude of escalator plots might be created to ad- 
dress these concerns and questions. Answers have serious 
consequences when considering how to improve teaching. 
If students who start correct stay correct, we can spend 
our time focusing on those who need the most help. If 
students change their answers, then we have to worry 
about how to help them stay where we wish them to stay. 
Answers also have serious consequences when considering 
our research. If students aren't answering consistently, 
can we fairly use results from a single test or observa- 
tion? How are we to evaluate students fairly? Some of 
these concerns are addressed and made clearer when the 
data are presented in a consistency plot. 
FIG. 5: Consistency plot showing changes in student midterm 
and final exam responses to identical questions in both 2006 
and 2008. 

FIG. 6: Sample consistency plot. Separate examples of voids, 
starbursts, and attractors are shown. 



III. CONSISTENCY PLOTS AS 
MULTI-DIMENSIONAL ANALYSIS 

To show how student responses change over time, we 
introduce a graphic we call a consistency plot. We draw 
a 2-d grid representing two separate ways of analyzing
student responses. Each student whose answers are different
from midterm to final is shown by an arrow formed
of a circle, a line, and a triangle. A circle represents an initial
response; a triangle represents a final response. This
circle-line-triangle grouping is referred to as a response 
pair and describes the two solution states of a single stu- 
dent at two different times. A square represents a re- 
sponse pair in which the initial and final responses are 
identical. Both the size of the element and the number 
placed within it reveal how many students had that par- 
ticular response pair. 
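
The bookkeeping behind these elements can be sketched in a few lines. The records below are hypothetical and the variable names are assumptions introduced for illustration; only the rule (identical states become squares, differing states become arrows, and each element carries a student count) comes from the description above.

```python
from collections import Counter

# Each record is (initial state, final state); a state is (method, correctness).
# Hypothetical matched records, not data from this study.
records = [
    (("limits", "correct"), ("limits", "correct")),
    (("limits", "correct"), ("limits", "correct")),
    (("+C", "boundary condition error"), ("other", "other")),
    (("+C", "boundary condition error"), ("other", "other")),
    (("limits", "math error"), ("limits", "correct")),
]

# Count how many students share each response pair.
response_pairs = Counter(records)

# Squares: initial and final responses are identical.
squares = {initial: n for (initial, final), n in response_pairs.items() if initial == final}

# Arrows (circle-line-triangle): the response changed between the two exams.
arrows = {pair: n for pair, n in response_pairs.items() if pair[0] != pair[1]}

print(squares)
print(arrows)
```

The counts stored with each square or arrow are what would set the size of the plotted element and the number printed inside it.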

The plot axes can represent any two lines of analysis of
the data; in our case (see figure 5), we use the previously
discussed categories of solution method and correctness.
We have arranged the axes to place those methods we
find "less favorable" closer to what would be the origin
in a canonical coordinate system (i.e., the lower left). We
place the limits method further to the right because it
gives us analytical power that the +C method obscures.
For example, when testing the limits of certain constant
values in the solution of an integral, the limits method
makes the functional form of the equation more apparent
sooner than the +C method. In situations such as
the "correctness" axis, where categories are not mutually
exclusive, we place student responses on the plot according
to the most egregious error. For example, a student
with both a math error and a boundary condition error
appears in the boundary condition error region of the
plot. More discussion of this hierarchical coding scheme
is found in Section III B.
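
The "most egregious error" rule can be written as a small function. This is a sketch under the assumption that severity increases from math error to boundary condition error to "other," as the category descriptions in this paper suggest; the names `SEVERITY` and `plotted_category` are invented for illustration.

```python
# Assumed ordering from least to most egregious (an interpretation of the text).
SEVERITY = ["correct", "math error", "boundary condition error", "other"]

def plotted_category(errors):
    """Return the single correctness category used to place a response on the plot."""
    if not errors:
        return "correct"
    # The most severe error present determines the row.
    return max(errors, key=SEVERITY.index)

print(plotted_category({"math error", "boundary condition error"}))
print(plotted_category({"boundary condition error", "other"}))
print(plotted_category(set()))
```

A consequence of this design is that each response occupies exactly one row, at the cost of hiding any less severe errors made in the same solution.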

Figure 5 shows a consistency plot of years 2006 and
2008. (2007 was omitted due to the difference in initial
and final question statements and will be discussed
in Section III B.) The plot shows a considerable variety
in student responses. Most response pairs describe individual
students; only five response pairs are made up of
more than one student: three students use limits and are
correct on both midterm and final; similarly, three use
limits and have a math error. Three pairs of students
change responses in an identical manner. In total, of the
29 student response pairs in 2006 and 2008, 22 (ca. 3/4)
are distinct. This variability in student responses indicates
that there is much more richness than indicated in
Tables I and II.
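
The count of distinct response pairs follows from the numbers just quoted; the arithmetic is spelled out here as a check, with $N$ symbols introduced only for this purpose.

```latex
\begin{align}
  N_{\text{students in shared pairs}} &= 3 + 3 + (3 \times 2) = 12 \\
  N_{\text{students with a unique pair}} &= 29 - 12 = 17 \\
  N_{\text{distinct pairs}} &= 17 + 5 = 22, \qquad \frac{22}{29} \approx 0.76
\end{align}
```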



A. Elements of consistency plots 

It is in the nature of consistency plots to be at least 
somewhat messy; the richness of information that they 
present can look chaotic at first glance. However, patterns
do emerge from the plots. Figure 6 shows an idealized
consistency plot that contains four elements we have
found in our data.

• In the upper right corner of the plot, the region
(limits, correct) is attractive, that is, many more
response pairs end there than begin there.

• The region (limits and +C, boundary condition error)
is an example of a starburst, the opposite of an
attractor.

• To the far left we see circulation, where two regions
are connected by opposite response pairs.

• Finally, in the bottom right, there is a void, a region
with no responses.

FIG. 7: Circulation pattern. Students flow out of and into a
region on the consistency plot. (Axis categories: correctness, with Correct, Math Error, Boundary Condition Error, and Other; solution method, with Other, Limits, +C, and Limits and +C.)
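
The four elements listed above can be expressed as simple rules on the flow into and out of each region. The sketch below is one possible operationalization and not the authors' procedure: the thresholds (any excess of inflow counts as an attractor, any excess of outflow as a starburst), the "balanced" label, the function names, and the region labels A through E are all assumptions.

```python
from collections import Counter

def region_flows(pairs):
    """pairs maps (initial_region, final_region) to a number of students."""
    inflow, outflow = Counter(), Counter()
    for (start, end), n in pairs.items():
        if start != end:  # students who stay put do not flow
            outflow[start] += n
            inflow[end] += n
    return inflow, outflow

def classify(region, pairs):
    inflow, outflow = region_flows(pairs)
    if not any(region in key for key in pairs):
        return "void"        # no response begins or ends here
    if inflow[region] > outflow[region]:
        return "attractor"   # more response pairs end here than begin here
    if outflow[region] > inflow[region]:
        return "starburst"   # more response pairs begin here than end here
    return "balanced"

def circulations(pairs):
    """Unordered region pairs connected by opposite response pairs."""
    return sorted({tuple(sorted(key)) for key in pairs
                   if key[0] != key[1] and key[::-1] in pairs})

# Hypothetical response pairs between regions labeled A to E.
pairs = {("A", "B"): 1, ("B", "A"): 1, ("C", "D"): 3, ("E", "D"): 1, ("D", "D"): 2}

print(classify("D", pairs), classify("C", pairs), classify("Z", pairs))
print(circulations(pairs))
```

With real data the simple inequalities would likely need a margin, since the text describes an attractor as having "many more" arrivals than departures.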

Our idealized consistency plot allowed us to clearly introduce
important groupings of student responses, separating
each grouping so that they could be clearly identified.
As is apparent from our plot of actual data in figure
5, real life is not so accommodating. Below, we present
examples of these elements from our 2006/08 consistency
plot. For clarity, we will show each element present on
a separate plot, along with comments regarding the implications
of each element for this particular question.
Before reading on, however, the reader may find it interesting
to return to figure 5 to find an example of each
grouping.

Figure 7 shows three examples of circulation, that is,
sets of opposite response pairs. A table-like approach
to the data would state that, for example, one student
on the midterm and final used an "other" method while
making a boundary condition error. In fact, our consistency
plot shows that this data point is made up of two
students, each of whom at one time or another was also in
the (other, other) category. Similarly, while a table would
show a net flow (of one for the data presented on this
sub-plot) into the (other, other) category, it would not
show that three students improved their responses. Circulation
is the 2-d version of Kanim's escalator diagram.
Examples of circulation are of interest since these are
the changes in response that are absolutely undetectable
in a traditional reporting of data, and are often ignored
when matched student responses are not considered.

FIG. 8: Attractor pattern. Students move toward one region
on the consistency plot.

In reality, there was much more flow into the (other,
other) region than out (see figure 8). We call such a
region an attractor. In our idealized plot, no responses
left the region; in this real world example, seven students
enter and three leave, giving a net flow of four students
into the region. Yet, of the seven students who found
themselves in this category on the final, all but
one used a separation of variables technique on the midterm,
and three midterm solutions contained only math errors. In
other words, most of the students who performed badly
on the final gave little forecast of this eventuality on the
midterm.

Although we would much rather see the (limits, correct)
region be the attractor (as it was on our idealized
plot, figure 6), this attractor makes visual the unfortunate
case of students who seemed to have it together during
the course falling apart on a final examination that covers
many topics. These results call into question the ways
in which we use individual (non-linked) midterm and final
examinations to make claims about what students do
and do not know.

Two small starbursts appear on our 2006/08 consistency
plot (see figure 9), with origins in (+C, boundary
condition error) and (limits and +C, boundary condition
error). In each case, two important aspects are apparent:
no students enter the region, and the students that
leave go in all directions, rather than to another specific
region. This implies that these combinations of methods
and errors are the result of unrefined ("everything plus
the kitchen sink") thinking about the problem. Had the
students all gone in the same direction, we might instead
infer that the indicated region can be thought of as a
stepping-stone toward a more refined solution.

FIG. 9: Starburst pattern. (Nearly all) students leave a given
region on the consistency plot.

We can see the difference between the "kitchen-sink" 
and the "stepping stone" interpretation when we look at 
the starburst in the (limits and +C, boundary condition
error) region. Three students use both limits and an integration
constant in their solutions on the midterm, with
two changing their methods on the final. When we first
ran across this method on the midterm, we interpreted it 
as a bridge or stepping stone between the use of integra- 
tion constants, the mainstay of math classes, and integra- 
tion limits, modeled by the physics instructor. Although 
mathematically sketchy (since, of course, limits and an 
integration constant serve the same mathematical role), 
we thought that it was simply a sign of students becom- 
ing used to the limits method and including it in their 
solutions without yet recognizing that the two methods 
were mathematically equivalent. We suspected that if 
this were the case, these students would later solely use 
the limits method as they became more comfortable with 
it. Had our assumptions been correct, these three stu- 
dents would have all moved toward the (limits, correct) 
region. They did not. The starburst indicates, instead, 
that these students were, on the midterm, simply throw- 
ing all the techniques available to them at the problem, 
and did the same on the final. The one student who 
does switch to the limits method makes errors substan- 
tial enough to be considered "other." 

Finally, we consider in figure 10 the highlighted void
region where few student responses are located. There
are 21 solutions in the region below the void, representing
students who used a method other than limits and
either did not successfully incorporate the boundary condition
into the problem or were unable to appropriately
represent the physical situation mathematically. Of 29
students over two exams in two years, only two (a total
of three times) use an integration constant and find its
value using the boundary condition. No students correctly
used a method other than separation of variables,
though other appropriate techniques exist. One might
recall the form of the solution of the differential equation
or use the more complex technique of variation of
parameters. Of course, zeros show up in tables, but the
visual nature of the consistency plot makes the regions
that contain few students as compelling as the regions
that contain many students, and leads us to consider
what students are not doing along with what they are
doing. Since we are looking at how students bring physical
meaning to the mathematics, the fact that only two of
29 students use a method other than the limits method
to correctly solve the problem is noteworthy and suggests
several pedagogical pathways toward helping students
learn the physics (and mathematics) better.

FIG. 10: Void pattern. (Nearly) no answers are found in an
extended region of the consistency plot.

B. Difficulties with a new representation 

As we have shown, our representation can allow a
deeper analysis of data than a simple table. However,
it is not without difficulties. As an example, we present
a consistency plot of student responses during the year
2007. In this year, the problem was changed on the final
so that the positive direction was up (in figure 3 it was
down). Our goal in reversing the direction was to give a
subtly different problem from the midterm, so that memorized
responses couldn't be used as readily. The change
in coordinate system leads to a change in the definition of
the initial velocity of the downwardly thrown ball, among
other things. Since, by convention, physical constants are
positive, students needed to define the initial velocity of
the ball as $-v_0$. The change also caused serious problems
for students not setting up the problem correctly; their
translation of the physical system into coordinates often
included the wrong signs. Such errors were considered
"other" errors (in keeping with our previous definitions).
Because of the seriousness of the error, students making
"other" errors would not be counted as making "boundary
condition" or "math" errors.

FIG. 11: Consistency plot from 2007. The final examination
question had a different coordinate system, leading to many
"other" responses that hide information from further analysis.

An initial perusal of figure 11 suggests that only three
students ended in the "boundary condition error" row 
of the diagram. However, the hierarchical nature of our 
vertical axis obscures the fact presented in table II that
nine students actually made the $-v_0$ error. Because seven
students made errors in the setup and execution of the 
problem that were severe enough to classify as "other," 
we lose information about the prevalence of boundary 
condition errors in the 2007 consistency plot. 
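
The loss of information under hierarchical coding can be illustrated with a minimal sketch. The responses below are invented toy data, not the 2007 results, and the priority order among "boundary condition" and "math" errors is an assumption made only for illustration; the text specifies only that "other" errors take precedence.

```python
from collections import Counter

# Assumed priority order for the hierarchical vertical axis: the most
# severe error present determines the single row a student occupies.
PRIORITY = ["other", "boundary condition", "math"]


def hierarchical_code(errors):
    """Return the one row a response occupies under hierarchical coding."""
    for label in PRIORITY:
        if label in errors:
            return label
    return "correct"


# Invented toy responses (NOT the 2007 data): each set lists every
# error observed in one student's final examination solution.
responses = [
    {"other", "boundary condition"},
    {"other", "boundary condition"},
    {"boundary condition"},
    {"math"},
    set(),
]

# Rows as they would appear on the plot: exactly one row per student.
plot_rows = Counter(hierarchical_code(r) for r in responses)

# Tally as a data table could report it: every observed error counted.
table_counts = Counter(label for r in responses for label in r)

print("students in boundary condition row:", plot_rows["boundary condition"])
print("students making boundary condition error:", table_counts["boundary condition"])
```

In this toy case the plot row undercounts the boundary condition error because two of the students who made it are absorbed into the "other" row, which is the same kind of discrepancy described above between the plot and the table.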

In part, this situation can be alleviated by careful at- 
tention to coding and the development of coding schemes. 
However, we cannot always predict in advance what the 
data might say and there is always the potential that a 
hierarchical coding scheme will conceal interesting pat- 
terns, so we do not recommend the abandonment of the 
data table. One might allow for fuzzy categories, but 
having one student in two regions seems impossible to 
interpret visually. One might create more detailed exclusive
categories, but developing too many categories
to eliminate a hierarchical axis and describe student
responses more fully may create a plot too chaotic to read.
We already have very few students doing the same thing 
from midterm to final; to use a finer comb to separate
the responses would make the data (including observation
of attractors and starbursts) nearly meaningless for
small classes. The balance between inclusiveness of cate- 
gories and visual accessibility will be specific to each data 
set, and so must be determined on an individual basis. 
We have chosen to keep the 2007 consistency plot on the 
same scale as the 2006/08 plot, for example. 



IV. CONCLUSIONS 

We have described consistency plots, a new method of 
presenting student responses and how they change over 
time. As a visual presentation of data, they allow the 
recognition of patterns of student response that may not 
be available or easily discernible when the data are presented
in table form. As an example, we use the plots
to discuss how students' solution methods and correct- 
ness for solving first-order separable differential equations 
change from a midterm to a final exam. 

We show several examples of interesting response pat- 
terns within the overall plot: circulation, attractors, star- 
bursts, and voids. Each is revealing about student per- 
formance in the class, yet most are hard to represent in 
simple tabular form. Circulation calls into question how 
one uses class-wide tables of data to imply improvement 
in student performance. Attractors can show where stu- 
dents are being led over the course of instruction. Star- 
bursts help us distinguish between solutions that indi- 
cate students developing toward new ideas (i.e., stepping 
stones) and solutions that indicate students' confusion 
about the problem. Voids indicate a "failure" to use some 
method correctly, but give little further information. 
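
The point about circulation can be made concrete with a small sketch. The (midterm, final) pairs below are invented for illustration and are not data from this study; they are chosen so that the class-wide totals are identical on both examinations even though most students switch.

```python
from collections import Counter

# Invented toy data (NOT from the study): one (midterm, final)
# correctness pair per student.
pairs = [
    ("correct", "incorrect"),
    ("incorrect", "correct"),
    ("correct", "correct"),
    ("incorrect", "incorrect"),
    ("correct", "incorrect"),
    ("incorrect", "correct"),
]

# Class-wide totals, as a traditional pre/post table would report them.
midterm_totals = Counter(pre for pre, _ in pairs)
final_totals = Counter(post for _, post in pairs)

# Individual transitions, the information a consistency plot keeps.
transitions = Counter(pairs)
switched = sum(n for (pre, post), n in transitions.items() if pre != post)

print("midterm totals:", dict(midterm_totals))
print("final totals:", dict(final_totals))
print("students who switched:", switched)
```

Here the two sets of totals match, so a class-wide table would suggest that nothing changed, while the transition counts show that four of the six hypothetical students changed categories.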

We also note difficulties with the use of consistency 
plots. When a hierarchical categorization of data is
employed, some patterns of student response may be obscured.
The inability to use fuzzy categorization, which requires
each student to be placed in only one region, is another problem.
Student responses are often richer than a simple plot
can represent. Thus, consistency plots are limited by the 
researchers' assumptions and interpretations of the data. 
Our interests in this paper were in students giving phys- 
ical meaning to mathematical solutions. Different plots 
would have been drawn had we been analyzing how stu- 
dents translated a physical situation into mathematical 
statements. 

We believe consistency plots can be useful for displaying
data in a wide variety of situations. Whenever individual
student responses to pre- and post-test questions
are being considered, it is likely that a consistency plot
will be a useful addition to standard tables for interpret- 
ing patterns of student response. 